Abstract

Many backtracking algorithms exhibit heavy-tailed distributions, in which their running time is often
much longer than their median. We analyze the behavior of two natural variants of the
Davis-Putnam-Logemann-Loveland (DPLL) algorithm for Graph 3-Coloring on sparse random graphs
$G(n, p = c/n)$. Let $P_c(b)$ be the probability that DPLL backtracks $b$ times. First, we calculate
analytically the probability $P_c(0)$ that these algorithms find a 3-coloring with no backtracking at all,
and show that it goes to zero faster than any analytic function as $c \to c^* = 3.847\ldots$ Then we show
that even in the "easy" phase $1 < c < c^*$ where $P_c(0) > 0$, including just above the emergence of the
giant component, the number of backtracks is exponentially large with positive probability, so that its
expectation is exponentially large. To our knowledge this is the first rigorous proof that the running
time of a natural backtracking algorithm has a heavy tail for graph coloring. In addition, we give
experimental evidence and heuristic arguments that this tail takes the form $P_c(b) \sim b^{-1}$ up to an
exponential cutoff.

1 Introduction 

Many common search algorithms for combinatorial problems have been found experimentally to exhibit a
heavy-tailed distribution in their running times; for instance, in the number of backtracks performed by
Davis-Putnam-Logemann-Loveland (DPLL) algorithms on constraint satisfaction problems such as
Satisfiability, Graph Coloring, and Quasigroup Completion. In such a distribution, with
significant probability, the running time is much larger than its median, and indeed the expectation can be
exponentially large even if the median is only polynomial. These distributions typically take a power-law
form, in which the probability that the algorithm backtracks $b$ times behaves as $P_c(b) \sim b^{-\gamma}$
for some exponent $\gamma$. One consequence of this is that if a run of the algorithm has taken longer than
expected, it is likely to take much longer still, and it would be a good idea to restart it (and follow a
new random branch of the tree) rather than continuing to search in the same part of the search space.

For Graph 3-Coloring, in particular, these heavy tails were found experimentally by Hogg and Williams [13]
and Davenport and Tsang [7]. At first, it was thought that this heavy tail indicated that many instances are
exceptionally hard. A clearer picture emerged when Gomes, Selman and Crato [11] found that the running
times of randomized search algorithms on a typical fixed instance show a heavy tail. In Figure 1 we show our
own experimental data on the distribution of the number of backtracks for two versions of DPLL described
below. In both cases the log-log plot follows a straight line, indicating a power law. As $n$ increases, the
slopes appear to converge to $-1$, and we conjecture that $P_c(b) \sim b^{-1}$ up to some exponential cutoff.

A fair amount of theoretical work has been done on heavy tails, including optimal restart strategies [14]
and formal models [5]. However, there have been relatively few rigorous results establishing that these tails
exist. The most desirable result would be a proof, for some natural probability distribution over problems
of size $n$, that $P_c(b) \sim b^{-\gamma}$ for some $\gamma$ in the limit of large $n$ and $b$. To our
knowledge, no such result has been obtained. In this paper, we show a weaker result, namely that $b$ is
exponentially large with positive probability, even for "easy" random problems where $b = 0$ with positive
probability (and, if $P_c(0) > 1/2$, the median value of $b$ is zero). One related result is due to
Achlioptas, Beame, and Molloy, who showed using lower bounds on resolution proof complexity that DPLL takes
exponential time on random instances of 3-SAT, even for some densities below the satisfiability threshold;
our results appear to be the first on Graph Coloring, and we rely on much simpler reasoning.

Figure 1: Log-log plots of the distribution of the number of backtracks $P_c(b)$ for the two DPLL algorithms
A and B described in the text on random graphs with $c = 3.5$. The data appears to follow a power law
$P_c(b) \sim b^{-1}$ in the limit $n \to \infty$.

Our results hold for two variants of DPLL. Both of them are greedy, in the sense that they branch on a 
vertex with the smallest available number of colors; in particular, they perform unit propagation, in which 
any 1-color vertex is immediately assigned that color. They are distinguished by which 2-color vertex they 
branch on when there are no 1-color vertices. In algorithm A, the vertices are given a fixed uniformly random 
ordering, and we branch on the 2-color vertex of lowest index. In algorithm B, we choose a vertex uniformly 
at random from among the 2-color vertices. In both variants, we try the two possible colors of the chosen 
2-color vertex in random order. (How we branch on 3-color vertices is immaterial, since there is always a 1- 
or 2-color vertex while the algorithm is coloring the giant component.) 
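
The two branching rules can be sketched in code. The following Python is a minimal illustration rather than the implementation used for the experiments: the function name `dpll_3color`, the adjacency-list input, and the convention that every abandoned color choice counts as one backtrack are assumptions made for this sketch.

```python
import random

COLORS = (0, 1, 2)  # stand-ins for red, green, blue


def dpll_3color(adj, variant="B", rng=None):
    """Greedy DPLL for 3-coloring; returns (coloring or None, backtracks).

    adj is a list of neighbor lists. Assumption of this sketch: each color
    choice that has to be abandoned counts as one backtrack.
    """
    rng = rng or random.Random()
    n = len(adj)
    order = list(range(n))
    rng.shuffle(order)                       # fixed random indices, used by variant A
    rank = {v: i for i, v in enumerate(order)}
    lists = [set(COLORS) for _ in range(n)]  # colors still available to each vertex
    color = [None] * n
    backtracks = 0

    def pick():
        """Branch on an uncolored vertex with the fewest available colors."""
        uncolored = [v for v in range(n) if color[v] is None]
        if not uncolored:
            return None
        fewest = min(len(lists[v]) for v in uncolored)
        candidates = [v for v in uncolored if len(lists[v]) == fewest]
        if variant == "A":
            return min(candidates, key=rank.__getitem__)  # lowest index
        return rng.choice(candidates)                      # uniformly at random

    def search():
        nonlocal backtracks
        v = pick()
        if v is None:
            return True                      # every vertex is colored
        options = list(lists[v])
        rng.shuffle(options)                 # try the available colors in random order
        for col in options:
            color[v] = col
            touched = [u for u in adj[v] if color[u] is None and col in lists[u]]
            for u in touched:
                lists[u].discard(col)
            if all(lists[u] for u in touched) and search():
                return True
            for u in touched:                # undo the choice and count the failure
                lists[u].add(col)
            color[v] = None
            backtracks += 1
        return False

    found = search()
    return (color if found else None), backtracks
```

A vertex with a single available color is always among those with the fewest colors, so unit propagation is handled by the same selection rule. The recursion depth equals the number of vertices, so graphs with more than a few hundred vertices would need a larger recursion limit or an explicit stack.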

Our main result is the following: 

Theorem 1.1 For algorithms A and B, let $b$ be the number of times the algorithm backtracks on $G(n, c/n)$.
If $1 < c < c^* = 3.847\ldots$, there exist constants $\beta, q > 0$ such that $\Pr[b > 2^{\beta n}] > q$,
and so $E[b] = \Omega(2^{\beta n})$.

Although this theorem does not show that the tail of $P_c(b)$ behaves as $b^{-1}$, we believe our arguments
can be refined to do that. Along the way, we calculate the precise probability that these algorithms succeed
with no backtracking at all:

Theorem 1.2 Let $1 < c < c^* = 3.847\ldots$ For algorithms A and B, the probability the algorithm colors
$G(n, c/n)$ without backtracking is

$$P_c(0) = \exp\left(-\int_0^{t_0} dt\, \frac{c\lambda^2}{2(1-\lambda)(2+\lambda)}\right) + o(1) \tag{1}$$

where $\lambda = (2/3) c (1 - t - e^{-ct})$ and $t_0$ is the smallest positive root of $1 - t - e^{-ct} = 0$.

We note below that $P_c(0)$ approaches zero faster than any analytic function as $c$ approaches $c^*$, and
comment on the fact that this "essential singularity" makes it very difficult to locate the threshold at
which such heuristics succeed using numerical experiments.

Our work is motivated partly by recent results of Ein-Dor and Monasson. Suppose the expected
amount of backtracking takes the form $\exp(\omega(c) n + o(n))$; then, based on an earlier analysis of
3-SAT by Cocco and Monasson, they estimate $\omega(c)$ by modeling the search tree with a time-dependent
branching process. The values of $\omega(c)$ they obtain using this approach agree very well with experiment,
especially when the average degree is large. Beame, Culberson, Mitchell and Moore proved for some DPLL
algorithms that $\omega(c) = \Theta(1/c^2)$ in the limit of large $c$, in agreement with an earlier scaling
argument. However, their arguments do not apply as well for small values of $c$, below the 3-colorability
threshold.

The idea behind Theorem 1.1 is very simple. Partway down a random branch of the tree, with positive
probability, the subgraph induced by the remaining vertices contains a small subgraph, which is not list- 
colorable given its remaining colors; say, a triangle composed of the vertices whose available colors are red 
and green. No matter what the algorithm does from that point on, it will encounter this subgraph over 
and over again, vainly recoloring other vertices in the hope that it will go away. Thus every branch of this 
subtree will fail, and the algorithm is forced to backtrack to before this subgraph's neighbors were colored. 
The result is that there is a strong positive correlation between the events that two different branches of the 
search tree fail, and so an exponentially large number of branches can fail even though a given one succeeds 
with positive probability. 

We will rely heavily on the fact that for both these variants of DPLL, a single random branch is equivalent
to a linear-time greedy heuristic, 3-GL, analyzed by Achlioptas and Molloy [2]. They showed that if
$1 < c < c^*$ where $c^* = 3.847\ldots$ then 3-GL colors $G(n, c/n)$ with positive probability. (If $c < 1$
then the graph with high probability has no bicyclic component and 3-GL colors it with probability 1.) This
shows that $P_c(0) > 0$, i.e., with positive probability these variants of DPLL succeed with no backtracking
at all. However, as our results show, the expected amount of backtracking is exponentially large even for
random graphs with $c$ in this "easy" regime, and indeed just above the appearance of the giant component at
$c = 1$.

The paper is organized as follows. In Section 2 we prove Theorem 1.2 by looking closely at 3-GL using
the techniques of Achlioptas and Moore [3], grouping the steps of the algorithm into rounds, and exactly
analyzing the correlations between the 1-color vertices colored in a given round. We also use generating
functions to calculate the distribution of the number of 1-color vertices at a given time.

In Section 3 we prove Theorem 1.1 along the lines alluded to above. First we show that a triangle
of red-green vertices appears with positive probability, dooming an entire subtree; then, we show that for
both variants of DPLL, with positive probability the number of leaves of this subtree is exponentially large.
Finally, in Section 4 we conclude and give some intuition about how Theorem 1.1 might be strengthened to
prove that the number of backtracks is distributed as a power law.

We use red, green, and blue to denote our three colors. All asymptotics are in the limit of large n, and 
we omit floors and ceilings. 

2 The probability of success without backtracking 

2.1 3-GL and differential equations 

Achlioptas and Molloy [2] analyzed a greedy list-coloring heuristic they call 3-GL. Each vertex $v$ has a
list $\mathcal{L}(v)$ of available colors, which are removed when they are assigned to its neighbors. We call
$v$ a $q$-color vertex if $|\mathcal{L}(v)| = q$, and every vertex is a 3-color vertex at the beginning.
Then 3-GL works as follows:

1. If there are any 1-color vertices, choose one at random and assign its available color to it. 

2. Else if there are 2-color vertices, choose one $v$ at random, and assign it a random color from $\mathcal{L}(v)$.

3. Else choose a 3-color vertex at random and assign a random color to it. 
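
The three rules above amount to repeatedly choosing a random vertex among those with the fewest available colors and giving it a random available color. A minimal Python sketch under that reading follows; the function names `three_gl` and `random_graph` and the adjacency-list representation are assumptions of this illustration, and the quadratic-time vertex scan is chosen for brevity, not for the linear running time of the actual heuristic.

```python
import random


def three_gl(adj, rng=None):
    """One run of the greedy list-coloring heuristic; returns a coloring or None."""
    rng = rng or random.Random()
    n = len(adj)
    lists = [{0, 1, 2} for _ in range(n)]   # available colors per vertex
    color = [None] * n
    for _ in range(n):
        uncolored = [v for v in range(n) if color[v] is None]
        fewest = min(len(lists[v]) for v in uncolored)
        if fewest == 0:
            return None                      # a 0-color vertex exists: the run fails
        # Rules 1-3: prefer 1-color, then 2-color, then 3-color vertices.
        v = rng.choice([u for u in uncolored if len(lists[u]) == fewest])
        color[v] = rng.choice(sorted(lists[v]))
        for u in adj[v]:
            if color[u] is None:
                lists[u].discard(color[v])
    return color


def random_graph(n, c, rng):
    """Sample G(n, p = c/n) as a list of neighbor lists."""
    p = c / n
    adj = [[] for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    return adj
```

Running `three_gl` on many independent samples of `random_graph(n, c, rng)` and recording the fraction of runs that return a coloring gives an empirical estimate of the success probability studied in this section.
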

Everything outside the giant component of $G(n, c/n)$ with high probability consists of trees and unicyclic
components, and it is easy to see that 3-GL succeeds on such components. Therefore, we focus on the phase 
of 3-GL which colors the giant component, during which there is always a 1- or 2-color vertex. We refer 
to steps of type (1) and (2) above, in which we color 1-color and 2-color vertices, as "forced" and "free" 
respectively. It will be useful to follow Achlioptas and Moore [3] and group steps into "rounds," where each 
round consists of a free step followed by a cascade of forced steps. 

Since the first branch of both our variants of DPLL is equivalent to a run of 3-GL, the probability $P_c(0)$
that they color the graph with no backtracking at all is the same as the probability that 3-GL succeeds, i.e., 
that it colors the entire graph without creating a 0-color vertex. This in turn is the probability that all of 
3-GL's rounds succeed. 

Now, define the state of a round as the number of uncolored vertices of each color list, i.e., the number of
3-color vertices and the number of 2-color vertices of each color pair, present at the beginning of that round
(by definition there are no 1-color vertices present). By the principle of deferred decisions, the uncolored
part of the graph is uniformly random in $G(n', p)$ where $n'$ is the total number of uncolored vertices.
Therefore, the probability that a given round fails is a function only of its state. Moreover, if we condition
on the state of each round, the events that various rounds fail become independent, and $P_c(0)$ is simply the
product over all rounds of the probability that they succeed.

As it turns out, the probability that a given round succeeds is a continuous function of its state, so to
calculate $P_c(0)$ within $o(1)$ it is sufficient to estimate the state to within $o(n)$. The technique of
differential equations, and in particular Wormald's theorem, allows us to do this. Let $S_2(R)$ and $S_3(R)$
be the number of 2- and 3-color vertices at the beginning of the $R$'th round. Then, the behavior of 3-GL on
$G(n, c/n)$ can be modeled with the following set of differential equations in the "rescaled" variables $s_3$
and $s_2$, where the variable of integration is $r = R/n$:

$$s_3(0) = 1, \qquad s_2(0) = 0,$$

$$\frac{ds_3}{dr} = -\frac{c s_3}{1-\lambda}, \qquad \frac{ds_2}{dr} = \frac{c s_3 - 1}{1-\lambda}, \tag{2}$$

where

$$\lambda = \frac{2}{3} c s_2 .$$

Specifically, let $s_3(r)$ and $s_2(r)$ be the solutions to (2), and let $r_0$ be the smallest positive root
of $s_2(r) = 0$. Then the following event holds with high probability: $S_3(R) = s_3(R/n) n + o(n)$ and
$S_2(R) = s_2(R/n) n + o(n)$, with $s_2(R/n) n/3 + o(n)$ 2-color vertices of each color pair, uniformly for
all $R$ with $0 < R/n < r_0$. Since Achlioptas and Molloy [2] showed that 3-GL succeeds with positive
probability, this event holds with high probability even when we condition on the event that 3-GL succeeds.

We briefly review how the differential equations (2) are derived. The idea is that each round can be
modeled by a branching process in which coloring a vertex $v$ causes some of $v$'s 2-color neighbors to become
1-color vertices. A priori we have a 3-type branching process, consisting of the 1-color vertices of the three
colors, with a $3 \times 3$ transition matrix $M$ whose entries depend on the number of 2-color vertices of the
three color pairs; for instance, the expected number of red 1-color vertices created by coloring a vertex blue
is $p$ times the number of red-blue vertices. This results in a system of four coupled differential equations,
which we omit here. However, since both this system and the initial conditions are symmetric under permutations
of the colors, its trajectory is symmetric as well, and we can reduce it to the smaller system (2). In that
case there are with high probability $s_2 n/3 + o(n)$ 2-color vertices of each color pair, so we have

$$M = \frac{c}{3}\begin{pmatrix} 0 & s_2 & s_2 \\ s_2 & 0 & s_2 \\ s_2 & s_2 & 0 \end{pmatrix}
    = \frac{1}{2}\begin{pmatrix} 0 & \lambda & \lambda \\ \lambda & 0 & \lambda \\ \lambda & \lambda & 0 \end{pmatrix} \tag{3}$$

and $M$'s largest eigenvalue is $\lambda$ (the other two are $-\lambda/2$), the total expected number of
1-color vertices created per step.

If $\lambda < 1$ this branching process is subcritical, and the expected number of initially 2-color vertices
colored during a round is $1/(1-\lambda) - o(1)$. Here $o(1)$ includes the probability that the graph induced
by these vertices is not a tree (including the probability that a 0-color vertex is created and the round
fails). These vertices have an expected number $p S_3/(1-\lambda) - o(1)$ of 3-color neighbors; only $o(1)$ of
these become 1- or 0-color vertices, and the rest become 2-color vertices. Rescaling according to Wormald's
theorem then yields the differential equations (2).

To solve (2), it is convenient to change the variable of integration from $r$ to $t$, where $T = tn$ is the
number of steps (free and forced) taken so far. Using $dt/dr = 1/(1-\lambda)$, this gives the original
differential equations derived in [2]:

$$\frac{ds_3}{dt} = -c s_3, \qquad s_3(0) = 1,$$

$$\frac{ds_2}{dt} = c s_3 - 1, \qquad s_2(0) = 0. \tag{4}$$

The solution to (4) is easily seen to be

$$s_3(t) = e^{-ct}, \qquad s_2(t) = 1 - t - e^{-ct} . \tag{5}$$

Maximizing $s_2(t)$ shows that $\lambda < 1$ for all $t$ if and only if $c < c^*$, where $c^* = 3.847\ldots$
is the root greater than 1 of $c - \ln c = 5/2$ (the equation has a second root near $c = 0.09$, which plays
no role here). Using the $-1$st branch of Lambert's function, defined as $W(x) = y$ where $y e^y = x$,
we can write $c^* = -W_{-1}(-e^{-5/2})$.
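
As a quick numerical check of the threshold, the root of $c - \ln c = 5/2$ above 1 can be located by bisection. The sketch below uses only the Python standard library; the function names `c_star` and `lambda_max` are illustrative, not taken from the original analysis.

```python
import math


def c_star(tol=1e-12):
    """Root greater than 1 of c - ln(c) = 5/2, found by bisection."""
    lo, hi = 1.0, 10.0   # c - ln(c) - 5/2 is negative at 1, positive at 10, increasing between
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid - math.log(mid) - 2.5 < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lambda_max(c):
    """Maximum over t of lambda(t) = (2/3) c s_2(t), attained at t = ln(c)/c."""
    t = math.log(c) / c
    return (2.0 / 3.0) * c * (1.0 - t - math.exp(-c * t))
```

Evaluating `lambda_max(c_star())` should return 1 up to rounding, which is the condition that defines the threshold.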

The number of rounds performed after $T$ steps is with high probability $r(T/n) n + o(n)$, where

$$r(t) = \int_0^t dt\,(1-\lambda) = \frac{2}{3}\left(1 - e^{-ct}\right) + \left(1 - \frac{2c}{3}\right) t + \frac{c}{3} t^2 . \tag{6}$$

We will use this in the proof of Theorem 1.1 below.
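
The relation between the round-based and step-based descriptions can be checked numerically. The sketch below integrates the round-based system $ds_3/dr = -c s_3/(1-\lambda)$, $ds_2/dr = (c s_3 - 1)/(1-\lambda)$, $dt/dr = 1/(1-\lambda)$ with a classical fourth-order Runge-Kutta scheme and compares the result with the closed forms for $s_3(t)$, $s_2(t)$ and $r(t)$. The function names and the step count are assumptions of this illustration, and the value of $c$ must keep $\lambda < 1$ along the trajectory.

```python
import math


def rhs(c, state):
    """Right-hand side of the round-based system, with the step time t carried along."""
    s3, s2, t = state
    speed = 1.0 / (1.0 - (2.0 / 3.0) * c * s2)   # dt/dr = 1 / (1 - lambda)
    return (-c * s3 * speed, (c * s3 - 1.0) * speed, speed)


def integrate_rounds(c, r_end, steps=2000):
    """Classical RK4 from r = 0 to r_end; returns (s3, s2, t) at r_end."""
    h = r_end / steps
    state = (1.0, 0.0, 0.0)
    for _ in range(steps):
        k1 = rhs(c, state)
        k2 = rhs(c, tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
        k3 = rhs(c, tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
        k4 = rhs(c, tuple(s + h * k for s, k in zip(state, k3)))
        state = tuple(s + h * (a + 2.0 * b + 2.0 * d + e) / 6.0
                      for s, a, b, d, e in zip(state, k1, k2, k3, k4))
    return state


def rounds_closed_form(c, t):
    """Rescaled number of rounds r(t) completed after t*n steps."""
    return (2.0 / 3.0) * (1.0 - math.exp(-c * t)) + (1.0 - 2.0 * c / 3.0) * t + (c / 3.0) * t * t
```

For example, integrating up to `rounds_closed_form(c, t)` should return a state whose last component is close to `t` and whose first two components are close to $e^{-ct}$ and $1 - t - e^{-ct}$.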

2.2 Proof of Theorem 1.2

In this section we use the branching process associated with 3-GL to calculate the probability that a given
round succeeds. As we argued above, conditioning on the state at the beginning of each round makes the
events that they succeed independent. Taking the product of these probabilities then gives (1) and proves
Theorem 1.2.

Lemma 2.1 Suppose that the state of a round $R$ contains $s_2 n/3 + o(n)$ 2-color vertices of each color pair,
where $\lambda = (2/3) c s_2 < 1$. Then the probability that $R$ succeeds is

$$q_{\rm success}(\lambda) = 1 - \frac{f(\lambda)}{n} + o(1/n) \tag{7}$$

where

$$f(\lambda) = \frac{c\lambda^2}{2(1-\lambda)^2(2+\lambda)} . \tag{8}$$

Proof. We associate $R$ with a tree $T$ as follows: let $T$'s edges consist of the pairs $u, v$ such that
coloring $u$ removes a color from $\mathcal{L}(v)$. Then $T$ spans the subgraph induced by the vertices
colored, plus any 0-color vertices created, during $R$. We will say that $R$ generates $T$.

Now, the probability that $R$ fails is clearly a function of the tree it generates, and the probability it
generates a given tree is a function only of its state. Since $\lambda < 1$, the branching process
corresponding to $R$ is subcritical, and arguments analogous to [3] show that the probability that $R$
generates a given tree differs by $o(1)$ from the probability that the branching process generates a tree of
the same type. This probability in turn is a continuous function of the entries of its transition matrix, and
therefore of $\lambda$. Finally, since the size $t$ of the tree generated by a subcritical branching process
has an exponential tail, its second moment $E[t^2]$ is finite; since $R$ fails with probability at most
$p t^2 = O(t^2/n)$, averaging over all trees gives a probability of failure $O(1/n)$, justifying the scaling
inherent in (7).

We now calculate $f(\lambda)$, i.e., $n$ times the probability that a round fails, within the branching
process model. First, suppose a round starts (on its free step) by coloring a vertex red. Then, using (3), the
expected number of 1-color vertices of each color generated by the round is [3]

$$(1 - M)^{-1} \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}
  = \frac{1}{(1-\lambda)(2+\lambda)} \begin{pmatrix} 2-\lambda \\ \lambda \\ \lambda \end{pmatrix},$$

i.e., $(2-\lambda)/((1-\lambda)(2+\lambda))$ red vertices (including the initial one) and
$\lambda/((1-\lambda)(2+\lambda))$ each of the other two colors. (Note that the total expected number of
vertices is $1/(1-\lambda)$, but their colors are correlated with the color of the initial vertex.)

Now, it is not the case that the probability of failure in a round is p times the number of pairs of 1-color 
vertices with the same color in T. For instance, if a red 1-color vertex u is colored before v becomes a red 
1-color vertex, u and v cannot be connected, since if they were v would have become a 1-color vertex (of 
a different color) when we colored u. The only "dangerous" pairs where coloring u might make v a 0-color 
vertex are those where v is present, but not yet colored, when we color u. 

It is easy to see that whether or not a round fails does not depend on the order in which we color the 
1-color vertices (although which vertex becomes a 0-color vertex does). Therefore, although 3-GL chooses 
from the 1-color vertices randomly, we can assume instead that we always color the youngest 1-color vertex, 
and thus perform a depth-first traversal of the tree $T$. A little reflection shows that the dangerous pairs
are then those $u, v$ where $v$ is an older sibling of $u$, an "uncle" which is older than $u$'s parent, or a
great-uncle older than $u$'s grandparent, and so on. In Figure 2 we show part of a round, and connect the
dangerous pairs with dotted lines. If there are $D$ such pairs, the expected probability that the round
succeeds is then $E[(1-p)^D] = 1 - c E[D]/n + o(1/n)$, so $f(\lambda) = c E[D]$.




Figure 2: The dangerous pairs in a round. The root is the vertex chosen on the free step, and siblings are 
ordered with the youngest on the left. 

Each vertex $v$ in the branching process has a number of children $m$ which is Poisson-distributed with
mean $\lambda$, and the number of dangerous pairs below $v$ includes those below each of its children. In
addition, if $v$ is red, its children are green and blue; given a pair of siblings $x$ and $y$ where $x$ is
younger, the number of additional dangerous pairs is either the number of green descendants at or below $x$ or
the number of blue ones, depending on $y$'s color. Since the expected number of green or blue descendants at
or below a green or blue vertex is $2/((1-\lambda)(2+\lambda))$, $y$ takes each of these colors with
probability 1/2, and the expected number of pairs of siblings is $E\left[\binom{m}{2}\right] = \lambda^2/2$,
we have

$$E[D] = \lambda E[D] + \frac{\lambda^2}{2}\cdot\frac{1}{(1-\lambda)(2+\lambda)}$$

and so

$$E[D] = \frac{\lambda^2}{2(1-\lambda)^2(2+\lambda)} .$$

Setting $f(\lambda) = c E[D]$ gives (8) and completes the proof. □
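
The expected number of dangerous pairs can also be estimated by direct simulation of the branching process. The sketch below is an illustration under the model described above: each vertex has a Poisson number of children with mean $\lambda$, each child takes one of the two colors different from its parent uniformly at random, and a pair is dangerous when a vertex shares its color with an older sibling of itself or of one of its ancestors. The function names and the Poisson sampler (Knuth's multiplication method) are choices made for this sketch.

```python
import math
import random


def poisson(rng, lam):
    """Sample a Poisson(lam) variate by Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def dangerous_pairs(rng, lam):
    """Sample the tree of one round and count its same-colored dangerous pairs."""
    total = 0
    # Each entry is (color, older), where older[c] is the number of vertices of
    # color c that are older siblings of this vertex or of one of its ancestors.
    stack = [(0, (0, 0, 0))]
    while stack:
        color, older = stack.pop()
        total += older[color]
        counts = list(older)
        for _ in range(poisson(rng, lam)):          # children, oldest first
            child = rng.choice([c for c in range(3) if c != color])
            stack.append((child, tuple(counts)))
            counts[child] += 1                      # now an older sibling of later children
    return total


def expected_pairs(lam):
    """Closed form E[D] = lam^2 / (2 (1 - lam)^2 (2 + lam))."""
    return lam ** 2 / (2.0 * (1.0 - lam) ** 2 * (2.0 + lam))
```

Averaging `dangerous_pairs` over many samples for a fixed $\lambda < 1$ should approach `expected_pairs(lam)`, up to sampling error.
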
Lemma 2.1 then implies the following.

Lemma 2.2 Let $1 < c < c^* = 3.847\ldots$ The probability that 3-GL succeeds on $G(n, c/n)$, and that
algorithms A and B color $G(n, c/n)$ without backtracking, is

$$P_c(0) = \exp\left(-\int_0^{r_0} dr\, f(\lambda(r))\right) + o(1) \tag{9}$$

where $f(\lambda)$ is given by (8), $\lambda(r) = (2/3) c s_2(r)$, $s_2(r)$ is the solution of (2), and $r_0$
is the smallest positive root of $s_2(r) = 0$.

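Proof. Given Lemma 2.1 and including the $o(1)$ probability that the state is not within $o(n)$ of that
predicted by the differential equations for all $r$, we can write

$$\begin{aligned}
P_c(0) &= \prod_{R=0}^{r_0 n} q_{\rm success}(\lambda(R/n)) + o(1) \\
       &= \prod_{R=0}^{r_0 n} \exp\left(-\frac{f(\lambda(R/n))}{n} + o(1/n)\right) + o(1) \\
       &= \exp\left(-\frac{1}{n}\sum_{R=0}^{r_0 n} f(\lambda(R/n))\right) + o(1) \\
       &= \exp\left(-\int_0^{r_0} dr\, f(\lambda(r))\right) + o(1) .
\end{aligned}$$

In the second line we used $\ln(1-x) = -x + O(x^2)$, and in the last line we used the fact that
$f(\lambda(r))$ is bounded and differentiable as long as $\lambda(r) < 1$. □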

Finally, we obtain (1) from (9) by changing the variable of integration from $r$ to $t$. Since
$dt/dr = 1/(1-\lambda)$, this gives

$$P_c(0) = \exp\left(-\int_0^{t_0} dt\, \frac{c\lambda^2}{2(1-\lambda)(2+\lambda)}\right) + o(1) . \tag{10}$$

Here $\lambda = (2/3) c s_2(t)$ and $t_0$ is the time at which we complete the giant component, or
equivalently, the first time after we start coloring the giant component at which the number of 2-color
vertices becomes zero. Using (5), this is the smallest positive root of

$$s_2(t) = 1 - t - e^{-ct} = 0 , \tag{11}$$

completing the proof of Theorem 1.2. □

We have not found a closed form for the integral in (10). However, Figure 3 compares values of $P_c(0)$
obtained by integrating (10) numerically with experimental data for graphs of size $n = 10^4$, and they are in
excellent agreement.
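
One way to carry out this numerical integration is sketched below. It evaluates $-\ln P_c(0) = \int_0^{t_0} dt\, c\lambda^2 / (2(1-\lambda)(2+\lambda))$ with $\lambda = (2/3) c (1 - t - e^{-ct})$ by composite Simpson quadrature, ignoring the $o(1)$ correction. The function names, the bisection bracket and the number of intervals are assumptions of this sketch, which is valid only for $1 < c < c^*$; the integrand becomes sharply peaked as $c$ approaches $c^*$, so more intervals are needed there.

```python
import math


def s2(c, t):
    """Rescaled number of 2-color vertices after t*n steps."""
    return 1.0 - t - math.exp(-c * t)


def t_zero(c):
    """Smallest positive root of s_2(t) = 0, by bisection on [ln(c)/c, 1]."""
    lo, hi = math.log(c) / c, 1.0     # s_2 is positive at its maximum and negative at t = 1
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if s2(c, mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def integrand(c, t):
    lam = (2.0 / 3.0) * c * s2(c, t)
    return c * lam * lam / (2.0 * (1.0 - lam) * (2.0 + lam))


def neg_log_p0(c, intervals=20000):
    """Composite Simpson estimate of -ln P_c(0); intervals must be even."""
    t0 = t_zero(c)
    h = t0 / intervals
    total = integrand(c, 0.0) + integrand(c, t0)
    for i in range(1, intervals):
        total += (4.0 if i % 2 else 2.0) * integrand(c, i * h)
    return total * h / 3.0
```

For $c = 3.8$ this evaluates to roughly 4.57, so that `math.exp(-neg_log_p0(3.8))` is about 0.01.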



2.3 Another approach: the distribution of 1-color vertices 

In this section, we look at a heuristic calculation of $P_c(0)$ in which we consider the steps of 3-GL one at
a time, rather than in rounds. This method is analytically simpler than that in the previous section. However,
to make it rigorous, we would need to deal with the fact that the events that a pair of nearby steps fail are
positively correlated if they occur in the same round; for instance, they are both more likely to fail if that
round colors many vertices. (One way to remove this correlation would be to note that the probability that
more than one step in a given round fails is $O(\mathrm{polylog}(n)/n^2)$, so taking a union bound over the
$O(n)$ steps, with probability $1 - o(1)$ no two steps fail in the same round.)

Figure 3: A comparison of our calculation (10) of the probability $P_c(0)$ of success without backtracking
(the solid line) with experimental results (the stars) as a function of $c$. The experiments consisted of
$10^4$ trials for each value of $c$, on graphs of size $n = 10^4$.

We start by calculating the probability distribution $p(x)$ of the number of 1-color vertices that are present
at a given time, since the probability that two of these are neighbors is essentially $p$ times its second
moment. If we think of 3-GL in single steps rather than in rounds, Achlioptas and Molloy showed that $x$ obeys
a biased random walk, where at each step we first decrement $x$ if it is positive (since we color a 1-color
vertex if one exists) and then increase it by a random variable $y$ which is Poisson-distributed with mean
$\lambda$ (since we create $y$ new 1-color vertices).

Since $\lambda$ varies continuously with $t$, as $n \to \infty$ we can assume that $\lambda$ is roughly
constant over a large number of steps, in which case $p(x)$ will be close to the stationary distribution of
this biased random walk. We can calculate $p(x)$ using its generating function

$$g(z) = \sum_{x=0}^{\infty} p(x) z^x .$$

In particular, $g(0) = p(0)$ is the probability that there is no 1-color vertex, i.e., that the current step is
a free step. Since the expected change in $x$, which is $p(0) - 1 + \lambda$, must be zero for the stationary
distribution, we have $p(0) = 1 - \lambda$.

Decrementing $x$ by 1 if $x > 0$ corresponds to dividing $g(z)$ by $z$ except for the $z^0$ term, and adding
$y$ to $x$ corresponds to multiplying $g(z)$ by the generating function of the Poisson distribution,

$$\sum_{y=0}^{\infty} \frac{e^{-\lambda}\lambda^y}{y!} z^y = e^{\lambda(z-1)} .$$

Thus the effect of each step on $g(z)$ is

$$g(z) \to \left(\frac{g(z) - p(0)}{z} + p(0)\right) e^{\lambda(z-1)}
       = \frac{g(z) + (1-\lambda)(z-1)}{z}\, e^{\lambda(z-1)}$$

and solving for the stationary distribution gives

$$g(z) = \frac{(1-\lambda)(z-1)}{z e^{-\lambda(z-1)} - 1} .$$

The expected number of 1-color vertices present on a given step is then

$$E[x] = g'(1) = \frac{\lambda(2-\lambda)}{2(1-\lambda)}$$

and the expected number of 1-color vertices other than the one colored on a given step is

$$E[x] - 1 + p(0) = E[x] - \lambda = \frac{\lambda^2}{2(1-\lambda)} , \tag{12}$$

any of which could conceivably become a 0-color vertex on that step.
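
The stationary distribution of this walk can also be computed directly, which gives an independent check of $p(0) = 1 - \lambda$ and of the mean $\lambda(2-\lambda)/(2(1-\lambda))$. The sketch below iterates the one-step update (decrement if positive, then add a Poisson number with mean $\lambda$) on a truncated state space; the truncation size, the iteration count and the function names are assumptions of this illustration, chosen for moderate values of $\lambda$.

```python
import math


def poisson_pmf(lam, kmax):
    """Poisson probabilities for 0..kmax."""
    pmf = [math.exp(-lam)]
    for k in range(1, kmax + 1):
        pmf.append(pmf[-1] * lam / k)
    return pmf


def step(p, pois):
    """Apply one step of the walk to the distribution p over states 0..len(p)-1."""
    n = len(p)
    shifted = [0.0] * n
    shifted[0] = p[0] + p[1]            # states 0 and 1 both lead to 0 after the decrement
    for x in range(2, n):
        shifted[x - 1] = p[x]
    out = [0.0] * n
    for x, px in enumerate(shifted):
        for y in range(n - x):          # add y new 1-color vertices, truncated at n - 1
            out[x + y] += px * pois[y]
    norm = sum(out)
    return [v / norm for v in out]      # renormalize the mass lost to truncation


def stationary(lam, size=60, iters=400):
    """Approximate stationary distribution by repeated application of step."""
    pois = poisson_pmf(lam, size)
    p = [1.0] + [0.0] * (size - 1)
    for _ in range(iters):
        p = step(p, pois)
    return p
```

With `p = stationary(lam)`, the quantity `sum(x * px for x, px in enumerate(p)) - lam` can be compared with $\lambda^2/(2(1-\lambda))$ from (12).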

However, the colors of the existing 1-color vertices are correlated with each other, so we can't simply
divide (12) by 3. Thinking back to the tree $T$ generated by a round, if two vertices are $k$ steps apart in
the tree, then the probability that they are the same color is given by

$$F(k) = \frac{1 - F(k-1)}{2} = \frac{1}{3}\left(1 + 2\left(-\frac{1}{2}\right)^k\right),$$

e.g. $F(0) = 1$, $F(1) = 0$ (since edges in $T$ only connect 1-color vertices of different colors),
$F(2) = 1/2$, and so on.

In a branching process of branching ratio $\lambda$, the average number of vertices $k$ steps away from a
given vertex is $\lambda^k$. Summing over all $k \ge 1$ and dividing by 2 since we are counting each pair of
vertices twice, the expected number of partners forming a dangerous pair with the vertex colored on a given
step is

$$\frac{1}{2}\sum_{k=1}^{\infty} \lambda^k F(k)
  = \frac{1}{6}\sum_{k=1}^{\infty} \lambda^k + \frac{1}{3}\sum_{k=1}^{\infty} \left(-\frac{\lambda}{2}\right)^k
  = \frac{1}{6}\,\frac{\lambda}{1-\lambda} - \frac{1}{3}\,\frac{\lambda}{2+\lambda}
  = \frac{\lambda^2}{2(1-\lambda)(2+\lambda)} .$$

Multiplying this by $p = c/n$ and integrating over the steps $0 < t < t_0$ gives the same integral for the
expected number of 0-color vertices created while coloring the giant component as in (10).

2.4 The singularity at c* and the difficulty of numerical experiments 

As $c$ approaches $c^*$, the maximum value of $\lambda$ approaches 1, and the integral in (10) diverges. To
isolate the nature of this divergence, we expand the integrand in terms of partial fractions, which gives

$$-\ln P_c(0) = \frac{c}{6}\int_0^{t_0} dt \left(\frac{1}{1-\lambda} - \frac{2+3\lambda}{2+\lambda}\right)
             = \frac{c}{6}\int_0^{t_0} \frac{dt}{1-\lambda} - O(1) . \tag{13}$$

Given $\lambda = (2/3) c s_2$ and (5), $s_2$ and $\lambda$ are maximized at $t_{\max} = (\ln c)/c$. Expanding
$\lambda$ as a Taylor series in $t$ around $t_{\max}$ gives

$$\int \frac{dt}{1-\lambda} \approx \int \frac{dt}{1 - \lambda_{\max} - (1/2)\lambda''(t - t_{\max})^2}
  = \frac{\pi\sqrt{2}}{\sqrt{1-\lambda_{\max}}\,\sqrt{-\lambda''}} \tag{14}$$

where

$$\lambda'' = -\frac{2}{3} c^2$$

and

$$\lambda_{\max} = \frac{2}{3}(c - \ln c - 1) .$$



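Let $c = c^* - \varepsilon$ where $c^*$ is the root greater than 1 of $c - \ln c = 5/2$. To leading order in
$\varepsilon$, we have

$$1 - \lambda_{\max} = \varepsilon\,\frac{d\lambda_{\max}}{dc} = \frac{2(c^*-1)}{3c^*}\,\varepsilon$$

and (13) and (14) give

$$-\ln P_c(0) = \frac{A}{\sqrt{\varepsilon}} - O(1)$$

where

$$A = \frac{\pi}{2}\sqrt{\frac{c^*}{2(c^*-1)}} \approx 1.29 .$$

Thus the probability of success is given by

$$P_c(0) = \exp\left(-A/\sqrt{\varepsilon}\right) \Theta(1)$$

which goes to zero faster than any analytic function as $\varepsilon \to 0$. In particular, all of its
derivatives with respect to $c$ are zero at $c^*$.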
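
For readers who wish to check the constant, the value of $A$ follows by substituting the leading-order expressions $1 - \lambda_{\max} = \frac{2(c^*-1)}{3c^*}\varepsilon$ and $-\lambda'' = \frac{2}{3}(c^*)^2$ into the Lorentzian integral and multiplying by the prefactor $c^*/6$. The following worked step is a sketch of that algebra and uses only quantities defined above:

```latex
\begin{align*}
\frac{c^*}{6}\cdot\frac{\pi\sqrt{2}}{\sqrt{1-\lambda_{\max}}\,\sqrt{-\lambda''}}
  &= \frac{c^*}{6}\cdot\frac{\pi\sqrt{2}}
     {\sqrt{\dfrac{2(c^*-1)}{3c^*}\,\varepsilon}\;\sqrt{\dfrac{2}{3}(c^*)^2}} \\
  &= \frac{c^*}{6}\cdot\frac{3\pi\sqrt{2}}{2\sqrt{c^*(c^*-1)}}\cdot\frac{1}{\sqrt{\varepsilon}} \\
  &= \frac{\pi}{2}\sqrt{\frac{c^*}{2(c^*-1)}}\cdot\frac{1}{\sqrt{\varepsilon}}
   = \frac{A}{\sqrt{\varepsilon}} .
\end{align*}
```

With $c^* \approx 3.8474$ this gives $A \approx 1.29$. As a consistency check, $c = 3.84$ corresponds to $\varepsilon \approx 0.0074$ and $A/\sqrt{\varepsilon} \approx 15.0$, which exceeds the tabulated value $-\ln P_c(0) = 13.654$ by an amount of order 1, as expected from the $O(1)$ term.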

While the threshold $c^*$ below which 3-GL succeeds with positive probability can be determined
analytically, more sophisticated heuristics often require numerical experiments, if only to confirm a long and
involved journey through a large system of coupled differential equations. However, since $P_c(0)$ approaches
zero very rapidly as $c \to c^*$, the number of trials we have to do to confirm that 3-GL succeeds with
positive probability increases very rapidly. Using methods from statistical physics, Deroulers and Monasson [8]
found the same critical behavior for heuristics on random 3-SAT; we expect a similar pattern for other
heuristics, such as the smoothed Brelaz heuristic analyzed by Achlioptas and Moore [3], which succeeds for
$c < 4.03$.

To illustrate this, in Table 1 we show $P_c(0)$ for various values of $c$. Note that to measure $c^*$ to one,
two, or three decimal digits, we need to do roughly $10^2$, $10^6$, and $10^{28}$ trials! On a practical level,
this means that numerical experiments will systematically underestimate the threshold below which a heuristic
of this type succeeds with positive probability.



    c        $-\ln P_c(0)$    $P_c(0)$
    3.8       4.569           0.0104
    3.84     13.654           $1.176 \times 10^{-6}$
    3.847    63.467           $2.733 \times 10^{-28}$

Table 1: The rapid decrease of $P_c(0)$ as $c$ approaches $c^* \approx 3.8474$.



3 Exponential backtracking with positive probability 

In this section we prove Theorem 1.1, establishing rigorously that the number of backtracks of DPLL on
random graphs with degree $1 < c < c^*$ has a heavy tail.

Proof of Theorem 1.1. We focus on algorithm A first, in which each vertex is given an index in a fixed
random order. Let $t_1$ be a constant such that $0 < t_1 < t_0$ where $t_0$ is given by (11). Run the
algorithm for $t_1 n$ steps, and then continue until the end of the current round (which takes with high
probability $o(n)$ more steps), conditioning on not having created a 0-color vertex so far. This is equivalent
to running 3-GL conditioned on its success, so as discussed above, at the end of these $t_1 n + o(n)$ steps
there are with high probability $s_3(t_1) n + o(n)$ 3-color vertices and $s_2(t_1) n/3 + o(n)$ 2-color
vertices of each color pair, where $s_3(t_1)$ and $s_2(t_1)$ are given by (5). In addition, the uncolored part
of the graph $G'$ is uniformly random in $G(n', p)$ where $n'$ is the total number of uncolored vertices.

Let us call a triangle bad if it is composed of 2-color vertices whose allowed colors are red and green, it is
disconnected from the rest of $G'$, and the indices of its vertices are all greater than the median index of
the 2-color vertices in $G'$. Now, let $E_1$ be the event that $G'$ contains exactly one bad triangle. It is
easy to see that the distribution of the number of bad triangles is within $o(1)$ of a Poisson distribution
with expectation

$$m = \frac{1}{8}\binom{s_2 n/3}{3} p^3 (1-p)^{3(n'-3)} + o(1)
    = \frac{c^3 s_2^3}{1296}\, e^{-3c(s_2+s_3)} + o(1),$$

with $s_2 = s_2(t_1)$ and $s_3 = s_3(t_1)$. Then $E_1$ occurs with probability $q_1 = m e^{-m} + o(1) > 0$.
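
To get a feeling for the size of these constants, the Poisson mean can be evaluated from the definition of a bad triangle. The sketch below assumes the limiting form $m = (c s_2)^3/1296 \cdot e^{-3c(s_2+s_3)}$, obtained from $\binom{s_2 n/3}{3}$ red-green triples, a factor $1/8$ for all three indices lying above the median, $p^3$ for the three edges, and $(1-p)^{3n'}$ for isolation from the rest of $G'$; this derivation and the function names are assumptions of the illustration.

```python
import math


def bad_triangle_rate(c, t1):
    """Limiting Poisson mean m for bad triangles after t1*n steps (assumed form)."""
    s3 = math.exp(-c * t1)
    s2 = 1.0 - t1 - s3
    # (1/8) * C(s2 n / 3, 3) * p^3 tends to (c s2)^3 / 1296; the exponential
    # factor is the probability that the triangle has no other uncolored neighbor.
    return (c * s2) ** 3 / 1296.0 * math.exp(-3.0 * c * (s2 + s3))


def prob_exactly_one(m):
    """Probability that a Poisson(m) variable equals 1."""
    return m * math.exp(-m)
```

For typical parameter values the resulting probability is very small, but it is a positive constant independent of $n$, which is all the argument requires.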

Let us call this triangle $\Delta$. It is important to us in the following ways:

1. It is not 2-colorable, so every branch of this subtree will fail, and the algorithm will be forced to
backtrack at least to the $(t_1 n)$th step and uncolor one of $\Delta$'s blue neighbors.

2. Since $\Delta$ is isolated from the rest of $G'$, we will find this contradiction only if we choose one of
$\Delta$'s vertices from the pool of 2-color vertices; we will not be led to $\Delta$ by a chain of forced
steps.

3. When running algorithm A, we won't choose any of $\Delta$'s vertices until we run out of 2-color vertices
of lower index, and this will not happen until we have taken at least $s_2(t_1) n/2$ more steps.

In other words, $\Delta$ will cause the entire subtree starting with these $t_1 n$ steps to fail, but we won't
find out about it until we explore the tree $\Theta(n)$ more deeply, and visit an exponential number of nodes.

To formalize this, let $t_2$ be a constant such that $0 < t_2 < s_2(t_1)/2$, and consider running the
algorithm for another $t_2 n$ steps. This produces a search tree of depth $t_2 n$, where each internal node
corresponding to a forced or free step has one or two children respectively. If we choose a random branch of
this tree by following the two branches with equal probability each time we come to a free step, this is
equivalent to running 3-GL on the graph $G'' = G' \setminus \Delta$. Each leaf of the tree corresponds either
to creating a 0-color vertex and backtracking, or having run for $t_2 n$ steps without creating a 0-color
vertex. We call these "bad" and "good" leaves respectively.

We will abuse notation by letting $G(n'-3, p)$ denote a random graph with three fewer red-green vertices
than $G'$. Once we condition on the number of uncolored vertices of each color list in $G'$ and on the event
that $E_1$ occurs, $G''$ is uniformly random in $G(n'-3, p)$ except for the condition that it has no bad
triangles (it is easy to see this given the structure of $G(n, p)$ as a product space). The progress of 3-GL
on $G(n'-3, p)$ is still given by the same differential equations, since removing three vertices changes the
rescaled variables by $o(1)$. Since there is one free step per round, the number of free steps performed by
$t_2 n$ steps of 3-GL on $G(n'-3, p)$ is with high probability $\alpha n + o(n)$ where
$\alpha = r(t_1 + t_2) - r(t_1)$ and $r(t)$ is given by (6). But the event that $G(n'-3, p)$ has no bad
triangles occurs with probability $e^{-m} + o(1) = \Theta(1)$, so conditioning on this event the number of
free steps performed by 3-GL on $G''$ is still with high probability $\alpha n + o(n)$. Furthermore, 3-GL
succeeds on $G(n'-3, p)$ for $t_2 n$ steps with probability at least $P_c(0)$, and since this success implies
the condition that $G(n'-3, p)$ has no bad triangles, 3-GL succeeds on $G''$ with probability
$P \ge P_c(0)$.

Let us transform the search tree to a binary tree T with the same number of leaves, by replacing each 
chain of forced steps with a single edge, and leaving just the internal nodes corresponding to free steps. The 
depth of a leaf is now the number of free steps on the way to it, and a run of 3-GL samples a given leaf 
at depth $i$ with probability $2^{-i}$. Let $M$ be the average depth of a good leaf according to this probability
distribution; then with high probability $M = \alpha n + o(n)$, and the total probability of the good leaves is $P$.

We wish to prove a lower bound on the number of leaves. If T were perfectly balanced, this would be 
easy; but unfortunately M is not exponentially concentrated, so the depth of the leaves can vary significantly. 
Therefore, we employ the following lemma. 

Lemma 3.1 Let $T$ be a binary tree. Assign a probability $2^{-i}$ to each leaf at depth $i$, and label each leaf
"good" or "bad." Let $M$ be the average depth of the good leaves (weighted by these probabilities), and let $P$ be
their total probability. Then there are at least $P\,2^M$ good leaves.

Proof of Lemma 3.1. Let $N$ be the number of good leaves; we prove the lemma by induction on the size
of the tree. For the base case, a tree consisting of a single vertex has $P = 1$, $M = 0$ and $N = 1$ if it is good,
and $P = 0$ and $N = 0$ if it is bad.

Now assume inductively that the lemma is true for $T$'s subtrees. Let $N_\ell$ and $N_r$ denote the number of
good leaves of the left and right subtrees, $M_\ell$ and $M_r$ their average depths (measured from the subtrees'
roots) and $P_\ell$ and $P_r$ their conditional probabilities. Then we have $N = N_\ell + N_r$, $P = (P_\ell + P_r)/2$, and

$$M = \frac{P_\ell M_\ell + P_r M_r}{2P} + 1 .$$

Let $p, q \ge 0$ and $p + q = 1$. Then for any $A, B > 0$ we have

$$pA + qB \ge A^p B^q \qquad (15)$$

i.e., the weighted arithmetic mean is at least as large as the weighted geometric mean. Then taking
$p = P_\ell/(2P)$, $q = P_r/(2P)$, $A = 2^{M_\ell}$ and $B = 2^{M_r}$, we have

$$\begin{aligned}
N &= N_\ell + N_r \\
  &\ge P_\ell\, 2^{M_\ell} + P_r\, 2^{M_r} \\
  &\ge (2P)\, 2^{(P_\ell M_\ell + P_r M_r)/(2P)} \\
  &= P\, 2^M .
\end{aligned}$$

□
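
The lemma can be sanity-checked numerically. The sketch below builds random binary trees in which every internal node has two children, labels leaves good or bad at random, and compares the number of good leaves $N$ with $P\,2^M$. The tree encoding (a leaf is a boolean, an internal node is a pair), the function names, and the parameters `p_leaf` and `p_good` are assumptions made for this illustration and do not come from the paper.

```python
import random

def random_tree(rng, max_depth, p_leaf=0.3, p_good=0.5):
    """Leaf: True (good) or False (bad). Internal node: (left, right)."""
    if max_depth == 0 or rng.random() < p_leaf:
        return rng.random() < p_good
    left = random_tree(rng, max_depth - 1, p_leaf, p_good)
    right = random_tree(rng, max_depth - 1, p_leaf, p_good)
    return (left, right)

def good_leaf_stats(tree, depth=0):
    """Return (N, P, S): the number of good leaves, their total
    probability, and the sum of probability * depth over good leaves,
    where a leaf at depth i has probability 2**-i."""
    if isinstance(tree, bool):
        if tree:
            w = 2.0 ** -depth
            return 1, w, w * depth
        return 0, 0.0, 0.0
    nl, pl, sl = good_leaf_stats(tree[0], depth + 1)
    nr, pr, sr = good_leaf_stats(tree[1], depth + 1)
    return nl + nr, pl + pr, sl + sr

def check_lemma(tree):
    """True if N >= P * 2**M, up to a small floating-point tolerance."""
    n_good, p_good, weighted_depth = good_leaf_stats(tree)
    if n_good == 0:
        return True  # P = 0, so the bound is trivial
    mean_depth = weighted_depth / p_good  # M, the weighted average depth
    return n_good >= p_good * 2.0 ** mean_depth * (1.0 - 1e-9)

def balanced_tree(depth):
    """Perfectly balanced tree with all leaves good (the equality case)."""
    if depth == 0:
        return True
    return (balanced_tree(depth - 1), balanced_tree(depth - 1))
```

A perfectly balanced tree with all leaves good attains the bound with equality, since then $N = 2^M$ and $P = 1$; unbalanced trees have strictly more good leaves than $P\,2^M$.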

Lemma 3.1 and the above arguments imply that with probability $q_1 - o(1)$, algorithm A will backtrack at least
$P_c(0)\, 2^{\alpha n - o(n)} = 2^{\alpha n - o(n)}$ times. Taking any $q < q_1$ and any $\beta < \alpha$ completes the proof for A.

The proof for algorithm B is similar. We remove the condition on the indices of the bad triangle $\Delta$ (removing the factor of 1/8
from $m$ above). However, the branches of $T$ can now fail either because

1. 3-GL creates a 0-color vertex while running on $G''$, or

2. the algorithm chooses to branch on one of $\Delta$'s vertices.

Let $s_2^{\min} = \min_{t_1 \le t \le t_2} s_2(t)$. Then the probability that one of $\Delta$'s vertices is chosen on a given step is with
high probability at most $3/(s_2^{\min} n + o(n))$, and by a union bound over the $t_2 n$ steps the probability a branch fails
for the second reason is at most $3 t_2/s_2^{\min} + o(1)$. The probability of the good leaves is then
$P \ge P_c(0) - 3 t_2/s_2^{\min} - o(1)$, and by taking $t_2$
sufficiently small we can ensure that $P > 0$. Thus B also backtracks $2^{\alpha n - o(n)}$ times with probability $q_1 - o(1)$.
We again take $q < q_1$ and $\beta < \alpha$, and the proof is complete.
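
To make the choice of $t_2$ concrete, the following sketch evaluates the lower bound $P_c(0) - 3 t_2/s_2^{\min}$ with the $o(1)$ terms dropped, and the threshold below which it stays positive. The function names and the numerical inputs are hypothetical; in particular the values used for $P_c(0)$ and $s_2^{\min}$ are placeholders, not quantities computed in the paper, and the separate constraint $t_2 < s_2(t_1)/2$ is not enforced here.

```python
def good_leaf_probability_lower_bound(p0: float, t2: float, s_min: float) -> float:
    """Lower bound p0 - 3*t2/s_min on the probability of the good leaves,
    ignoring the o(1) corrections."""
    return p0 - 3.0 * t2 / s_min

def largest_safe_t2(p0: float, s_min: float) -> float:
    """Any t2 strictly below this threshold keeps the bound positive."""
    return p0 * s_min / 3.0

# Hypothetical inputs, for illustration only.
p0_example, s_min_example = 0.2, 0.3
threshold = largest_safe_t2(p0_example, s_min_example)
bound_at_half = good_leaf_probability_lower_bound(
    p0_example, threshold / 2.0, s_min_example
)
```

Halving the threshold value of $t_2$ leaves half of $P_c(0)$ as the guaranteed probability of the good leaves, at the cost of a smaller exponent $\alpha$.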



4 Discussion 

We have shown that DPLL algorithms take exponential time with positive probability for random graphs 
G(n,c/n), even in the "easy" range 1 < c < 3.847... where with positive probability they color the graph 
with no backtracking at all. This happens because the events that different branches of the search tree fail 
are far from independent; since a single bad triangle $\Delta$ dooms an entire subtree to failure, the probability all
its branches fail is positive even though a random branch succeeds with positive probability. The algorithm
then tries to 2-color $\Delta$ an exponential number of times, naively hoping that recoloring other vertices will
render $\Delta$ 2-colorable. In terms of restarts, once $\Delta$ has "spoiled" an entire section of the search space, it
makes more sense to start over with a new random branch. 

Experimentally, Figure 2 shows that the distribution of the number of backtracks follows a power law
$P_c(b) \sim b^{-1}$. It might be possible to strengthen Theorem 1.1 to prove this power-law behavior in the following
way: suppose for the sake of argument that the bad triangle $\Delta$ appears at a uniformly random depth $d$ between 1 and $n$, and
that the running time $b$ is exactly $2^{Ad}$ for some constant $A$. Then the probability that $b$ is between $2^{Ad}$ and $2^{A(d+1)}$ is
$1/n$, giving a probability density $P_c(b) = 1/(2^{Ad}(2^A - 1)n) \sim 1/b$. Of course, $d$ is not uniformly distributed,
but any distribution which varies slowly from $\Theta(1)$ to $\Theta(n)$ would give the same qualitative result. The
difficulty is determining how $d$ is distributed, and then better understanding the distribution of $b$; however,
bounding $b$'s variance, say, seems quite challenging. We propose this as a direction for future work.
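
The heuristic calculation above can be checked directly. Under the stated toy assumptions ($d$ uniform on $1, \ldots, n$ and $b = 2^{Ad}$), the sketch below spreads probability $1/n$ over each interval $[2^{Ad}, 2^{A(d+1)})$ and multiplies the resulting density by $b = 2^{Ad}$; the product is the same for every $d$, which is the $1/b$ law. The function names and the values of `n` and `A` are arbitrary choices for illustration.

```python
def bin_density(d: int, n: int, A: float) -> float:
    """Density obtained by spreading probability 1/n uniformly over
    the interval [2**(A*d), 2**(A*(d+1)))."""
    width = 2.0 ** (A * (d + 1)) - 2.0 ** (A * d)
    return (1.0 / n) / width

def density_times_b(n: int, A: float) -> list:
    """density(b) * b evaluated at b = 2**(A*d) for d = 1, ..., n."""
    return [bin_density(d, n, A) * 2.0 ** (A * d) for d in range(1, n + 1)]

# Arbitrary illustrative parameters.
products = density_times_b(n=40, A=0.5)
```

Every entry of `products` equals $1/((2^A - 1)n)$ up to rounding, so in this toy model the density is exactly proportional to $1/b$ at the left endpoint of each interval.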